"TalkPrinting": Improving Speaker Recognition by Modeling Stylistic Features
نویسندگان
چکیده
Automatic speaker recognition is an important technology for intelligence gathering, law enforcement, and audio mining. Conventional speaker recognition systems, which are based on independent short-term spectral samples, suffer from a lack of noise robustness and are unable to model a speaker’s idiosyncratic stylistic features. This paper describes “TalkPrinting”, a program of research aimed at adding such stylistic features to conventional systems. Results on three preliminary systems based on stylistic features demonstrate that (1) the new features alone carry significant speaker information; (2) they also carry significant complementary information compared to the conventional features; and (3) they provide increasing improvements in performance with increasing test durations.
منابع مشابه
شبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کاملModulation features for noise robust speaker identification
Current state-of-the-art speaker identification (SID) systems perform exceptionally well under clean conditions, but their performance deteriorates when noise and channel degradations are introduced. Literature has mostly focused on robust modeling techniques to combat degradations due to background noise and/or channel effects, and have demonstrated significant improvement in SID performance i...
متن کاملEffect of Gender on Improving Speech Recognition System
Speech is the output of a time varying excitation excited by a time varying system. It generates pulses with fundamental frequency F0. This time varying impulse trained as one of the features, characterized by fundamental frequencyF0and its formant frequencies. These features vary from one speaker to another speaker and from gender to gender also. In this paper the effect of gender on improving...
متن کاملEffect of Gender on Improving Speech Recognition System
Speech is the output of a time varying excitation excited by a time varying system. It generates pulses with fundamental frequency F0. This time varying impulse trained as one of the features, characterized by fundamental frequencyF0and its formant frequencies. These features vary from one speaker to another speaker and from gender to gender also. In this paper the effect of gender on improving...
متن کاملASR Dependent Techniques for Speaker Recognition
This thesis is concerned with improving the performance of speaker recognition systems in three areas: speaker modeling, verification score computation, and feature extraction in telephone quality speech. We first seek to improve upon traditional modeling approaches for speaker recognition, which are based on Gaussian Mixture Models (GMMs) trained globally over all speech from a given speaker. ...
متن کامل